152 research outputs found

    Querying and Merging Heterogeneous Data by Approximate Joins on Higher-Order Terms

    Using ILP to Identify Pathway Activation Patterns in Systems Biology

    We show a logical aggregation method that, combined with propositionalization methods, can construct novel structured biological features from gene expression data. We do this to gain understanding of pathway mechanisms, for instance, those associated with a particular disease. We illustrate this method on the task of distinguishing between two types of lung cancer: Squamous Cell Carcinoma (SCC) and Adenocarcinoma (AC). We identify pathway activation patterns in pathways previously implicated in the development of cancers. Our method identified a model with comparable predictive performance to the winning algorithm of a recent challenge, while providing biologically relevant explanations that may be useful to a biologist.

    Probabilistic (logic) programming concepts

    A multitude of different probabilistic programming languages exists today, all extending a traditional programming language with primitives to support modeling of complex, structured probability distributions. Each of these languages employs its own probabilistic primitives, and comes with a particular syntax, semantics and inference procedure. This makes it hard to understand the underlying programming concepts and appreciate the differences between the different languages. To obtain a better understanding of probabilistic programming, we identify a number of core programming concepts underlying the primitives used by various probabilistic languages, discuss the execution mechanisms that they require and use these to position and survey state-of-the-art probabilistic languages and their implementation. While doing so, we focus on probabilistic extensions of logic programming languages such as Prolog, which have been considered for over 20 years.

    Using a logical model to predict the growth of yeast

    Background: A logical model of the known metabolic processes in S. cerevisiae was constructed from iFF708, an existing Flux Balance Analysis (FBA) model, and augmented with information from the KEGG online pathway database. The use of predicate logic as the knowledge representation for modelling enables an explicit representation of the structure of the metabolic network, and enables logical inference techniques to be used for model identification/improvement. Results: Compared to the FBA model, the logical model has information on an additional 263 putative genes and 247 additional reactions. The correctness of this model was evaluated by comparison with iND750 (an updated FBA model closely related to iFF708) by evaluating the performance of both models on predicting empirical minimal medium growth data/essential gene listings. Conclusion: ROC analysis and other statistical studies revealed that use of the simpler logical form and larger coverage results in no significant degradation of performance compared to iND750.

    Application of deep learning in detecting neurological disorders from magnetic resonance images: a survey on the detection of Alzheimer’s disease, Parkinson’s disease and schizophrenia

    Neuroimaging, in particular magnetic resonance imaging (MRI), has played an important role in understanding brain function and its disorders during the last couple of decades. These cutting-edge MRI scans, supported by high-performance computational tools and novel ML techniques, have opened up unprecedented possibilities to identify neurological disorders. However, similarities in disease phenotypes make it very difficult to detect such disorders accurately from the acquired neuroimaging data. This article critically examines and compares performances of the existing deep learning (DL)-based methods to detect neurological disorders, focusing on Alzheimer’s disease, Parkinson’s disease and schizophrenia, from MRI data acquired using different modalities including functional and structural MRI. The comparative performance analysis of various DL architectures across different disorders and imaging modalities suggests that the Convolutional Neural Network outperforms other methods in detecting neurological disorders. Towards the end, a number of current research challenges are indicated and some possible future research directions are provided.

    Automated Discovery of Food Webs from Ecological Data Using Logic-Based Machine Learning

    Networks of trophic links (food webs) are used to describe and understand mechanistic routes for translocation of energy (biomass) between species. However, a relatively low proportion of ecosystems have been studied using food web approaches due to difficulties in making observations on large numbers of species. In this paper we demonstrate that Machine Learning of food webs, using a logic-based approach called A/ILP, can generate plausible and testable food webs from field sample data. Our example data come from a national-scale Vortis suction sampling of invertebrates from arable fields in Great Britain. We found that 45 invertebrate species or taxa, representing approximately 25% of the sample and about 74% of the invertebrate individuals included in the learning, were hypothesized to be linked. As might be expected, detritivore Collembola were consistently the most important prey. Generalist and omnivorous carabid beetles were hypothesized to be the dominant predators of the system. We were, however, surprised by the importance of carabid larvae suggested by the machine learning as predators of a wide variety of prey. High probability links were hypothesized for widespread, potentially destabilizing, intra-guild predation; predictions that could be experimentally tested. Many of the high probability links in the model have already been observed or suggested for this system, supporting our contention that A/ILP learning can produce plausible food webs from sample data, independent of our preconceptions about “who eats whom.” Well-characterised links in the literature correspond with links ascribed with high probability through A/ILP. We believe that this very general Machine Learning approach has great power and could be used to extend and test our current theories of agricultural ecosystem dynamics and function. In particular, we believe it could be used to support the development of a wider theory of ecosystem responses to environmental change.

    Effects of Fusion between Tactile and Proprioceptive Inputs on Tactile Perception

    Tactile perception is typically considered the result of cortical interpretation of afferent signals from a network of mechanical sensors underneath the skin. Yet, tactile illusion studies suggest that tactile perception can be elicited without afferent signals from mechanoceptors. Therefore, the extent to which tactile perception arises from isomorphic mapping of tactile afferents onto the somatosensory cortex remains controversial. We tested whether isomorphic mapping of tactile afferent fibers onto the cortex leads directly to tactile perception, independently of proprioceptive input, by evaluating the impact of different hand postures on the perception of a tactile illusion across fingertips. Using the Cutaneous Rabbit Effect, a well-studied illusion evoking the perception that a stimulus occurs at a location where none has been delivered, we found that hand posture has a significant effect on the perception of the illusion across the fingertips. This finding emphasizes that tactile perception arises from integration of perceived mechanical and proprioceptive input and not purely from tactile interaction with the external environment.

    Assessment of predictive models for chlorophyll-a concentration of a tropical lake

    Background: This study assesses four predictive ecological models, Fuzzy Logic (FL), Recurrent Artificial Neural Network (RANN), Hybrid Evolutionary Algorithm (HEA) and multiple linear regression (MLR), for forecasting chlorophyll-a concentration using limnological data from 2001 through 2004 from the unstratified, shallow, oligotrophic to mesotrophic tropical Putrajaya Lake (Malaysia). Model performance is assessed using Root Mean Square Error (RMSE), correlation coefficient (r), and Area under the Receiver Operating Characteristic (ROC) curve (AUC). Chlorophyll-a has been used to estimate algal biomass in aquatic ecosystems because it is common to most algae. Algal biomass indicates the trophic status of a water body. Chlorophyll-a is therefore an effective indicator for monitoring eutrophication, which is a common problem of lakes and reservoirs all over the world. Assessing these predictive models is a necessary step towards developing a reliable algorithm to estimate chlorophyll-a concentration for eutrophication management of tropical lakes. Results: The same data set was used to develop all models, and the data were divided into training and testing sets to avoid bias in the results. The FL and RANN models were developed using parameters selected through sensitivity analysis. The selected variables were water temperature, pH, dissolved oxygen, ammonia nitrogen, nitrate nitrogen and Secchi depth. Dissolved oxygen, selected through a stepwise procedure, was used to develop the MLR model. The HEA model used parameters selected using a genetic algorithm (GA). The selected parameters were pH, Secchi depth, dissolved oxygen and nitrate nitrogen. The RMSE, r, and AUC values were 4.60, 0.5, and 0.76 for the MLR model; 4.49, 0.6, and 0.84 for the FL model; 4.28, 0.7, and 0.79 for the RANN model; and 4.27, 0.7, and 0.82 for the HEA model. The inconsistencies in model ranking across the performance criteria result from how each criterion measures performance: RMSE is based on the magnitude of the prediction error, whereas AUC is based on a binary classification task. Conclusions: Overall, HEA produced the best performance in terms of RMSE, r, and AUC values. This was followed by FL, RANN, and MLR.
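
The three criteria named in the abstract above can disagree because they measure different things. The following is a minimal sketch, not the study's own code: the function names, the sample values, and the 5.0 threshold used to turn concentrations into binary labels are all assumptions made for illustration. It shows a prediction with a constant offset that has a nonzero RMSE yet ranks every high-concentration sample above every low one.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive sample has the
    higher score; ties count as half a win."""
    pos = [s for lab, s in zip(labels, scores) if lab]
    neg = [s for lab, s in zip(labels, scores) if not lab]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical concentrations (illustrative values, not from the study).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 5.0, 7.0, 9.0]  # every prediction is off by +1

# Hypothetical threshold that defines the binary "high concentration" class.
threshold = 5.0
labels = [o > threshold for o in observed]

print("RMSE:", rmse(observed, predicted))
print("r:   ", pearson_r(observed, predicted))
print("AUC: ", auc(labels, predicted))
```

With these made-up values the RMSE is 1.0 while the AUC is 1.0, so a model can carry a systematic error and still separate the two classes perfectly. This is one way the rankings by RMSE and by AUC can differ.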